Universal Reinforcement Learning Algorithms: Survey and Experiments
Many state-of-the-art reinforcement learning (RL) algorithms assume
that the environment is an ergodic Markov Decision Process (MDP). In contrast,
the field of universal reinforcement learning (URL) is concerned with
algorithms that make as few assumptions as possible about the environment. The
universal Bayesian agent AIXI and a family of related URL algorithms have been
developed in this setting. While numerous theoretical optimality results have
been proven for these agents, there has been no empirical investigation of
their behavior to date. We present a short and accessible survey of these URL
algorithms under a unified notation and framework, along with experimental
results that qualitatively illustrate some properties of the resulting
policies and their relative performance on partially observable gridworld
environments. We also present an open-source reference implementation of the
algorithms which we hope will facilitate further understanding of, and
experimentation with, these ideas.
Comment: 8 pages, 6 figures, Twenty-sixth International Joint Conference on
Artificial Intelligence (IJCAI-17)
Fine-Tuning Language Models via Epistemic Neural Networks
Large language models are now part of a powerful new paradigm in machine
learning. These models learn a wide range of capabilities from training on
large unsupervised text corpora. In many applications, these capabilities are
then fine-tuned through additional training on specialized data to improve
performance in that setting. In this paper, we augment these models with an
epinet: a small additional network architecture that helps to estimate model
uncertainty and form an epistemic neural network (ENN). ENNs are neural
networks that can know what they don't know. We show that, using an epinet to
prioritize uncertain data, we can fine-tune BERT on GLUE tasks to the same
performance while using half as much data. We also investigate performance in
synthetic neural network generative models designed to build understanding. In
each setting, using an epinet outperforms heuristic active learning schemes
- …
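The idea of prioritizing uncertain data, mentioned in the abstract above, can be illustrated with a minimal sketch. This is not the authors' epinet: it assumes uncertainty is approximated by disagreement (variance) among several sampled predictions per example, for instance from ensemble members. The function names `disagreement_scores` and `select_most_uncertain` and the example probabilities are hypothetical.

```python
from statistics import pvariance
from typing import List, Sequence


def disagreement_scores(sampled_probs: Sequence[Sequence[float]]) -> List[float]:
    """Score each example by the variance of its predictions across samples.

    sampled_probs[i][j] is the positive-class probability that sampled
    model i assigns to example j (an assumed stand-in for an epinet's
    index-conditioned outputs).
    """
    n_examples = len(sampled_probs[0])
    return [
        pvariance([sample[j] for sample in sampled_probs])
        for j in range(n_examples)
    ]


def select_most_uncertain(
    sampled_probs: Sequence[Sequence[float]], batch_size: int
) -> List[int]:
    """Return indices of the batch_size examples with the highest disagreement."""
    scores = disagreement_scores(sampled_probs)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:batch_size]


if __name__ == "__main__":
    # Three sampled models, three unlabeled examples (made-up numbers).
    probs = [
        [0.9, 0.2, 0.5],
        [0.9, 0.8, 0.5],
        [0.9, 0.5, 0.6],
    ]
    # Example 1 has the most disagreement, example 0 has none.
    print(select_most_uncertain(probs, batch_size=2))
```

In an active-learning loop, the selected indices would be the examples sent for labeling and added to the fine-tuning set. The variance score here is only one possible choice of uncertainty measure.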